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Abstract. We consider solution of multiply shifted systems of nonsymmetric linear equations, 
possibly also with multiple right-hand sides. First, for a single right-hand side, the matrix is shifted 
by several multiples of the identity. Such problems arise in a number of applications, including 
lattice quantum chromodynamics where the matrices are complex and non-Hermitian. Some Krylov 
iterative methods such as GMRES and BiCGStab have been used to solve multiply shifted systems 
for about the cost of solving just one system. Restarted GMRES can be improved by deflating 
eigenvalues for matrices that have a few small eigenvalues. We show that a particular deflated 
method, GMRES-DR, can be applied to multiply shifted systems. 

In quantum chromodynamics, it is common to have multiple right-hand sides with multiple shifts 
for each right-hand side. We develop a method that efficiently solves the multiple right-hand sides by 
using a deflated version of GMRES and yet keeps costs for all of the multiply shifted systems close 
to those for one shift. An example is given showing this can be extremely effective with a quantum 
chromodynamics matrix. 

Key words, linear equations, GMRES, BiCG, deflation, eigenvalues, QCD, multiple shifts, 
multiple right-hand sides 
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1. Introduction. We consider the iterative solution of a large system of linear 
equations that not only has multiple right-hand sides, but also has multiple shifts 
for each right-hand side. Let nrhs be the number of right-hand sides and ns be the 
number of shifts. Then the problem is 



with j = 1, . . . , nrhs and i = 1, . . . , ns. Here A is a large matrix which may be 
nonsymmetric or complex non-Hermitian. We assume there is no preconditioning. 
The shift cti will be referred to as the base shift. For the rest of the paper, we will 
leave off the superscripts when it is clear which right-hand side is being dealt with. 

Systems with multiple right-hand sides occur in many applications (see, for exam- 
ple, [17] for some applications). There are several applications that need solution of 
multiply shifted systems, for example, control theory [5], and time-dependent differen- 
tial equations and frequency response computations fl6'. Some important problems 
in lattice quantum chromodynamics (lattice QCD) have both multiple right-hand 
sides and multiple shifts. For example, the Wilson-Dirac formulation 7J and over- 
lap fermion [33', [T] computations both lead to such problems. Very large complex 
non-Hermitian matrices are needed. For Wilson-Dirac matrices, the right-hand sides 
represent different noise vectors. The shifts correspond to different quark masses that 
are used in an extrapolation. 
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A standard way to solve systems with multiple right-hand sides is to use a block 
approach [371 However, block methods can be costly due to the orthogonaHza- 
tion expense, and also storage requirements can be prohibitive. Block methods are 
not generally used in lattice QCD. Even if multiple right-hand sides are solved se- 
quentially instead of together in a block, it is important to take advantage of the 
fact that several systems share the same matrix. Information developed during the 
solution of one right-hand side can be used to help with others. This idea has been 
developed in different ways, including in seed methods [S] and with Richardson it- 
eration Deflation methods can also be used (see [571 and their references). 
Deflation approaches appearing in QCD literature include |9l [121 El IM] • Deflation 
involves computing eigenvectors corresponding to the smallest eigenvalues (or other 
outstanding eigenvalues) and using them to remove the eigenvalues from the effective 
spectrum for the iterative method. GMRES-DR [57] is a deflation method that has 
been adapted for multiple right-hand sides Some details of GMRES-DR are 

given in the next section. 

Krylov methods have been developed for shifted matrix problems. Again, some 
details are in the next section. 

The goal of this paper is to develop deflated methods for dealing with multiple 
shifts along with both a single right-hand side and multiple right-hand sides. Section 
2 has background material on current methods for solving shifted systems and on 
deflated methods for multiple right-hand sides. In Section 3, deflation is used for 
multiply shifted systems with a single right-hand side, or the first of several right- 
hand sides. A multiply-shifted version of GMRES-DR is developed. There is also some 
comparison of GMRES with FOM for multiple shifts. Section 4 looks at solving the 
second and subsequent right-hand sides using eigenvector information and GMRES. 
Then the next section deals with the case of related right-hand sides. 

2. Review. 

2.1. Krylov methods for shifted systems. The Krylov subspace generated 
with the shifted matrix A — cr,/ and with starting vector vq is 



Krylov subspaces are shift invariant in that the subspace is the same regardless of 
the choice of ai. Therefore, one Krylov subspace can be used to solve several shifted 
systems as long as all the systems have the same right-hand side, or at least right- 
hand sides that are multiples of each other. Note that we are assuming there is 
no preconditioning. With a preconditioner, systems with different shifts would no 
longer have equivalent Krylov subspaces. So for the non-preconditioned case, it is 
fairly straightforward to develop versions of nonrcstarted Krylov methods for multiply 
shifted systems. Such versions have been given for the conjugate gradient method [141 
[45], nonrestarted GMRES [8], and both QMR and TFQMR [16]. A multiply shifted 
version of BiCGStab has also been developed [TBI. 




Restarted Krylov methods are not as straightforward. After a restart, all systems 
need to have parallel right-hand sides. This means that all residual vectors formed 
at the end of a cycle of the Krylov method need to be multiples of each other. For 
the FOM method 37J, this happens automatically [40]. For GMRES [31[37], the 
residuals can be forced to be parallel, as shown by Frommer and Glassner p9^. To 
see how this is possible, we need the Arnoldi recurrence |37j : 



(2.1) 



Span{ro, {A - crJ)ro, {A - crJ)^ro, {A - ct;/)" Vq}. 




(2.2) 



AVm — Vm+lHm, 
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where Vm is a n by fc matrix whose cohimns span the Krylov subspace, V„i^i is the 
same except for an extra column and H„i is an upper-Hessenberg m + 1 by m matrix. 
For a shifted system, this becomes 

{A - (jJ)V,n = V,n+i{H,n ~ (tJ), 

where / is the to + 1 by m identity matrix. 

Say that after a restart the approximate solution for the ith shifted system is io,i 
and the residual vector is r^^ i = PirQ i. So it is assumed that the residual is parallel 
to that of the base shift. The system is then 

{A - cril){xi - xo^i) = ro,i = /^^roj. 

For the next cycle, let the base system have standard GMRES solution Vmdi, where 
c?i is the solution of min||c — {H„i — crj/)di||, with ro.i = Vm+ic. Let be the new 
residual vector for the ith shifted system after this cycle. We need the approximate 
solution Vmdi to be chosen so that = /3"*^'"ri, for some scalar So 

(2.3) f3,ro,i -{A- a,I)Vmd, = /^."^'"(roa ~ {A ~ cj^)Vmd{). 
After using the Arnoldi recurrence (|2.2p . we have 

(2.4) Vm+l{|3^C - {Hm - <J^T)d^) = P^'^^Vm+lic - {H,n " CTj)di). 

Next the Vm+i can be dropped, and we let s = c — {H„i — ail)di and rearrange: 

We now use a QR factorization, H,n — ail = QR, with Q being an to + 1 by m + 1 
orthogonal matrix and R being m + 1 by to upper-triangular, and get 

The value of /Jf^™ can be determined from the last row (note the left side of the 
equation has zero in the last row). Then solution of an upper-triangular system 
determines di. 

A shortcut formula for the residual norm of the ith shifted system is 

(2.5) \\n\\ = mc - {Hm ~ 'y^I)d,\\■ 

The new approximate solution is Xi = xo^i + Vmdi. The systems then become A(xi — 
Xi) — P"'^"'ri. For more details, see |l9j. We will refer to this approach as GMRES- 
Shifts or GMRES-Sh. Note that only the base system has the minimum residual 
property. The solution of the other shifted systems is not equivalent to GMRES 
applied to those systems. This is in contrast to FOM-Shifts HI], which has each 
shifted system solved with the FOM approach. Like FOM-Sh, GMRES-Sh uses the 
same polynomial (after shifting) for all shifted systems p]9l |40] . 

2.2. Deflated GMRES. SmaU subspaces for restarted GMRES can slow con- 
vergence for difficult problems. Deflated versions of restarted GMRES [Ml [HI [131 
[Sl[3Hl[l[lll[ini[2Slll7] can improve this, when the problem is difhcult due to a 
few small eigenvalues. One of these approaches is related to Sorensen's implicitly 
restarted Arnoldi method for eigenvalues [l^ and is called GMRES with implicit 
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restarting [26 . A mathematically equivalent method, called GMRES with deflated 
restarting (GMRES-DR) [57j, is also related to Wu and Simon's restarted symmetric 
Lanczos eigenvalue method 01] . See ^I5[ H51 for some other related eigenvalue 
methods. 

We will concentrate on GMRES-DR [27] , because it is efflcient and relatively sim- 
ple. Approximate eigenvectors corresponding to the small eigenvalues are computed 
at the end of each cycle and are put at the beginning of the next subspace. Letting 
tq be the initial residual for the linear equations at the start of the new cycle and 
yi, . . .yk be harmonic Ritz vectors [TS] [3S1 [31] , the subspace of dimension m used 
for the new cycle of GMRES-DR(m,k) is 

(2.6) Span{yi,y2, ...yk,ro, Aro, A^o, A^tq, ^'"-'=-^0}. 

This can be viewed as a Krylov subspace generated with starting vector tq augmented 
with approximate eigenvectors. Remarkably, the whole subspace turns out to be a 
Krylov subspace itself (though not with 7'o as starting vector) [26]. Once the ap- 
proximate eigenvectors are moderately accurate, their inclusion in the subspace for 
GMRES essentially deflates the corresponding eigenvalues from the linear equations 
problem. 

GMRES-DR generates a recurrence similar to the Arnoldi recurrence (|2.2p . It is 

(2.7) AVr,, - Vm+lHm, 

where Vm is a n by m matrix whose columns span the subspace (|2.6p . Vm+i is the 
same except for an extra column and is an m -I- 1 by m matrix that is upper- 
Hessenberg except for a full fc -I- 1 by fc leading portion. A part of this recurrence can 
be separated out to give 

(2.8) AVk = Vk+iHk, 

where Vfc is a n by fc matrix whose columns span the subspace of approximate eigen- 
vectors, Vk+i is the same except for an extra column and Hk is a full k + 1 by k matrix. 
This recurrence allows access to both the approximate eigenvectors and their products 
with A while requiring storage of only fc -I- 1 vectors of length n. The approximate 
eigenvectors in GMRES-DR actually span a small Krylov subspace of dimension fc. 

2.3. Deflated GMRES for multiple right-hand sides. The multiple right- 
hand side approach that will be adapted here for multiple shifts is called GMRES-Proj . 
The flrst right-hand side is solved with GMRES-DR, then the eigenvector information 
thus generated is used to deflate eigenvalues from other right-hand sides [5^. This 
is done with a simple minimum residual projection over the eigenvectors alternated 
with cycles of regular GMRES (m). The expense of the projection does not generally 
add much to the GMRES cost. 

The algorithm for the projection is given next. It projects over the space of 
harmonic Ritz vectors spanned by the columns of Vk in Equation (|2.8p . This requires 
only 3fc -|- 2 vector operations of length n. 

Minres Projection for Vk 

1. Let the current approximate solution be xq and the current system of equa- 
tions be A{x — xq) = rg. Let Vk+i and Hk be from Equation (12. 8|) . 

2. Solve min||c — Hkd\\, where c — (Vfc+i)"^ro. 

3. The new approximate solution is a; = aio + Vkd. 
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4. The new residual vector is r = vq — AV^d — — Vk+iH^d. 

The GMRES-Proj method that follows is for all right-hand sides except for the 
first one. See [SD] for more details. 

GMRES(m)-Proj(k) 

1. After applying the initial guess xo, let the system of equations be A{x — xo) = 
ro- 

2. If it is known that the right-hand sides are related, project over the previous 
computed solution vectors. 

3. Apply the Minres Projection for Vk- This uses the Vk+i and Hk matrices 
developed while solving the first right-hand side with GMRES-DR. 

4. Apply one cycle of GMRES(m). 

5. Test the residual norm for convergence (can also test during the GMRES 
cycles). If not satisfied, go back to step 3. 

3. Deflated GMRES with Multiple Shifts. Subsection 2.1 gave some details 
of how restarted GMRES can be implemented in order to simultaneously solve mul- 
tiply shifted systems. For the deflated restarted GMRES method GMRES-DR, we 
can also solve multiply shifted systems concurrently. The key is that GMRES-DR has 
subspaces that are Krylov subspaces (as mentioned in Section 2.2), and they always 
contain the current right-hand side vector. The derivation in Subsection 2.1 for solv- 
ing multiply shifted systems with GMRES also applies here with very slight change. 
Because GMRES-DR has Krylov subspaces, it has the Arnold-like recurrence (|2.7p 
which can be used in place of GMRES 's Arnoldi recurrence (|2.2p {Hm is not upper- 
Hessenberg, but this does not affect the derivation). Also, since the subspaces contain 
the right-hand side, is again Vjn+i{(3ic) for some c, but here c is not a multiple 
of ei. Next is the algorithm for the shifting part of the new method that we call 
GMRES-DR wfth shifts or GMRES-DR-Sh. It is the same as for GMRES-Sh, except 
it uses GMRES-DR instead of GMRES and has the different form of i?,„. See [2?! for 
more on the GMRES-DR portion. See [21] for a multi-shifted version of the related 
method GMRES-E 24J. 

GMRES-DR-Sh 

1. At the beginning of a cycle of GMRES-DR-Sh, assume the current problem 
is (A — Uil){xi — io,i) = Piro^i, with f3i = 1, and where io,i is the current 
approximate solution to the ith shifted system. 

2. Apply GMRES-DR to A and generate Equation ^jTT} : AV^ = Vm+iHra- 

3. For the base system, solve the minimum residual reduced problem min||c — 
{Hm — <^lI)\\^ where c = V^j^^rQ^i and / is the m -1- 1 by to identity matrix. 
The new approximate solution is xi = io.i + Vmdi. The new residual vector 

IS ri = TQ^l - AVmdl = TQ^l -Vm+lHmdi. 

4. For the other shifted systems i = 2, . . . ns, form s = c — {Hm — ail)di. Apply 
a QR factorization: Hm — (Jil — QR. Solve Rdi — (3iQ'^c — using 
the last row to solve for (3"^"^ and the first to rows for di. 

5. The new approximate solution of the ith system is Xi — xq,,; + Vmdi, and the 
new residual is = (if'^ri. 

6. Test the residual norm for convergence. If not satisfied, for i — 2 . . .ns, set 
Pi = Z?"*^™ and for i = 1 . . . ns, set io,i = Xi and ro,i = r^. Then go back to 
step 1. 
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Fig. 3.1. Solution of a system with multiple shifts. 

Example 1. We test the shifted version of deflated GMRES and compare it to 
shifted regular GMRES. The matrix has n = 1000 and is bidiagonal with 0.1, 1, 2, 3, ... , 
998,999 on the main diagonal and I's on the supcrdiagonal. The right-hand side is 
generated randomly. The shifts are cr = 0, —0.4, —2. GMRES(25)-Sh is compared 
with GMRES-DR(25,10)-Sh. The resuhs are given in Figure 3.1. This problem has 
small eigenvalues which slow down restarted GMRES, particularly for the base system 
with cr = 0. Shifting the matrix even by just 0.4 improves convergence of GMRES(25), 
mainly because the smallest eigenvalue is moved from 0.1 to 0.5. GMRES-DR con- 
verges very rapidly once it generates approximations to the eigenvectors corresponding 
to these eigenvalues. The convergence of GMRES-DR for all three shifted systems 
is similar, since once the small eigenvalue are essentially removed by the deflation, 
shifting does not have such an important effect. 

In Example 1, the second and third shifted systems converge faster than the first 
system. However, in some situations, there can be convergence problems for the non- 
base systems [40j. Simoncini 0^ compares multiply shifted GMRES and FOM. Since 
FOM automatically has residuals parallel for all shifted systems, it can be argued that 
it is a more natural approach [40'. However, it is also a matter of where the roots 
of the FOM and GMRES polynomials fall in relation to the shift. The next example 
demonstrates this. As mentioned earlier, GMRES-Shifts uses the same polynomial 
for all shifted systems PTS]. Likewise, FOM-Shifts sticks with one polynomial for all 
shifts. The regular GMRES polynomial for an unshifted matrix is scaled to be 1 at 
zero and needs to be small over the spectrum. For shifted systems, we have a choice of 
viewing the polynomial as being 1 at zero and the spectrum shifted or the polynomial 
being 1 at the shift and the spectrum fixed as that of A. We chose the later. So 
for GMRES-Shifts with the base system A — ail, we view the polynomial chosen by 
GMRES as being 1 at cri and needing to be small over the spectrum of A. In the 
next two examples, plots are given of the roots of these polynomials. Because the 
polynomial needs to be at somewhat small over the spectrum of A, a small value of 
the polynomial at the shift can cause a problem. With the normalizing, there may 
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Fig. 3.2. Shifted GMRES and FOM 

then be large values at eigenvalues of A. So for a Krylov method to be effective, the 
roots of the polynomials need to generally stay away from the shift. 

Example 2. For the same matrix as in Example 1, we apply GMRES(40)-Sh 
and FOM(40)-Sh with shifts cr = 0.4, 0. The results are shown in Figure 3.2. For 
the base shift of cr = 0.4, GMRES works better, but FOM is more effective on the 
next shift. Figure 3.3 shows the harmonic Ritz values |23j i35j nearest the shift for 
50 cycles of GMRES. These are the roots of the GMRES polynomial for the system 
{A — 0.47)2:1 — b, shifted so they correspond with the spectrum of A. The harmonic 
Ritz values avoid the region around 0.4, which is good, since the shifted GMRES 
polynomial for cr = 0.4 needs to have value 1 at 0.4 and be somewhat small over the 
spectrum of A. This polynomial may not be effective if it has a root near to 0.4. 
GMRES(40) for a = 0.4 is able to slowly converge for this fairly difficult, indefinite 
problem. Meanwhile, FOM(40) with a = 0.4 does not converge, because as shown in 
Figure 3.3, the roots of the shifted FOM polynomials, which are the Ritz values, are 
not separated from 0.4. For the second shift value of 0, GMRES-Sh is not effective. 
Too many harmonic Ritz values occur not only near 0, but also on both sides of 
it. FOM gives erratic convergence because of some Ritz values near 0, but it does 
converge. Although not included in Figure 3.3, we also tested a third shift ct = — 1. 
GMRES-Sh again does not converge because of harmonic Ritz values near -1. This 
is in spite of the fact that GMRES would have no trouble if this was the base shift. 
The Ritz values for FOM do not occur near 1, so FOM converges to under 10~* in 
400 iterations. See 00] for more examples in which FOM is better for shifted systems 
than GMRES. We next give an example with GMRES more effective. 

Example 3. The matrix is the same as in the previous example. The base shift 
is CTi and a second shift of a2 = 1-4 is used. GMRES-Sh(80) is compared with 
FOM-Sh(80). The results are in Figure 3.4. Some of the Ritz values fall around 
1.4, while there is a gap in the harmonic Ritz values; see Figure 3.5. So shifted 
GMRES works much better for the second system. We conclude that methods for 
multi-shifted GMRES and FOM must both be used with some caution. However, 
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Harmonic Ritz Values with o = .4, Real Parts 
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Fig. 3.3. Distribution of Smallest Harmonic and Regular Ritz values, m=40, 50 cycles 
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GMRES(80)-Sh, o - 1.4 - dotted line 
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Fig. 3.4. GMRES-Sh versus FOM-Sh 



deflating eigenvalues can help. Figure 3.4 also has a plot of FOM-DR-Sh(80,2) for 
(J = 1.4, and we see that removing just two eigenvalues fixes the trouble. For the 
examples in the rest of this paper, the matrices and shifts are such that the base 
systems are the ones with the slowest convergence. 

4. Deflated GMRES for multiple right-hand sides and multiple shifts. 

We now consider solving multiply shifted systems that also have multiple right-hand 
sides. It is important to reuse information or share information among the right-hand 
sides. It is possible to design multi-shifted versions of both Block-GMRES f37^ and 
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Fig. 3.5. Distribution of Smallest Harmonic and Regular Ritz values, m=80, 25 cycles 

Block-GMRES-DR |28j. However, here we will concentrate on a non-block approach. 
The right-hand sides are solved separately, and eigenvector information from solution 
of the first right-hand side is used to assist the subsequent ones. More specifically, we 
will generalize for multiple shifts the GMRES-Proj approach mentioned in Section 2. 
See [30] for more on this method, including comparison with block methods. 

First some of the difficulties of deflating for subsequent right-hand sides will be 
discussed. Suppose the first right-hand side has been solved and approximate eigen- 
vectors have been generated. Then for the non-shifted case, there are several ways to 
deflate eigenvalues. Some of these are given in [301 ■ However, generally they do not 
work for multiply shifted systems. For example, some deflation approaches involve 
building a preconditioner from the approximate eigenvectors [H [31 [3D] . As mentioned 
earlier, shifted systems cannot be solved together if there is preconditioning. 

For the GMRES-Proj method, there is trouble with one of the two phases. We 
know the GMRES portion can be adapted to keep right-hand sides parallel for multiple 
shifts. However, the phase with projection over approximate eigenvectors generally 
fails to produce parallel residual vectors. Even though this projection is over a Krylov 
subspace of dimension k spanned by the columns of 14, this subspace does not contain 
the current right-hand side (the residual vector). So the derivation in Section 2.1 does 
not work with Vk+i and Hk from (|2.8p replacing Vm+i and Hm from (|2.2p . Specifically, 
the transition from Equation (|2.3p to Equation (|2.4p is not possible since r^^i and ro.i 
are not in the span of the columns of Vk+i- One case where the projection does keep 
the residual vectors parallel is with exact eigenvectors. We now show this. 

Theorem 4.f . Assume that before the minres projection, the shifted systems are 

{A - ail){xi - xo,i) = ro^i 

with ro.i — f3iro.i for i — 2, . . . , ns. Let zi, Z2, . . . , Zfc be eigenvectors of A. Then after 
the minres projection over the subspace Span{zi, Z2, ■ ■ ■ , Zk}, the residual vectors are 
parallel. 
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Proof. Let Z^. be the matrix with zi, . . . , Zk as columns. For exact eigenvectors, 
the minres projection is equivalent to Galerkin [37j . The Galerkin projection gives 
the reduced problem: 

Solving gives 

d, = A(Afe - a,4)-i(ZjZfc)-iZjro,i, 

where A^, is the fc by fc diagonal matrix with diagonal entries Ai through A/j and Ik is 
the k hy k identity matrix. The residual vector after projecting is then 

Ti = (3iro,i - {A - aiI)Zkdi 

= P^ra^ - HA - (jd)Zk{hk - (Jdky\zlZk)~^ZlrQ,i 
= Aro,i - ^3^Zk{^k - <J^Jk){Ak - a,4)-i(ZjZfc)"izJro,i 
= A(/-Zfe(ZfZfe)-izJ)ro,i. 

This shows that all are multiples of each other. □ 

So one option for GMRES-Proj with multiple shifts is to use only fairly accu- 
rate eigenvectors. We could sort through the approximate eigenvectors computed 
by GMRES-DR and apply only ones with acceptible accuracy to the projection in 
GMRES-Proj. This is now tested. 

Example 4- For the same matrix as in Example 1, we first solve the ai — 
system to accuracy of relative residual norm below 10^^°. This takes 250 matrix- 
vector products. The second column of Table 4.1 shows the accuracy of the fcth 
eigenvectors thus produced. For example, the residual norm of the tenth approximate 
eigenvectors is 4.1e-2, while the fourth is much better at l.Oe-6. Now we consider 
solution of the second right-hand side using the first k eigenvectors with k from 10 
down to 2. The third column gives the number of matrix- vector products for the 
relative residual norm of the base shifted system with the second right-hand side to 
reach l.e-lO. We see that convergence is better using all ten eigenvectors. However, 
the fourth column gives the acccuracy attained by the worst of the last two shifted 
systems (usually it is the third shift). It only reaches residual norm of 3.3e-4 if all 10 
approximate eigenvectors are used in the projection. With only four eigenvectors, the 
residual norm reaches a better level of 5.4e-8, but the convergence is almost twice as 
slow. 

The last three columns repeat this information for the case of solving the first 
right-hand side system to greater accuracy of relative residual norm of l.e-14 (385 
matrix- vector products). The second right-hand side systems are then able to be 
solved more accuracy. Even with k — 8 eigenvectors, all shifted systems reach ac- 
curacy of 1.3e-9 or better compared to 3.7e-5 for the previous case of less accurate 
eigenvectors. 

The problem with this approach is that the eigenvector computation during so- 
lution of the first right-hand side needs to be done to considerable accuracy, since we 
do not want to slow down convergence of the subsequent systems. If many right-hand 
sides are to be solved, this extra expense might not be significant. However, we next 
propose an approach without this concern of needing accurate eigenvectors. 

The key idea is that although the residual vectors cannot be kept parallel, they 
can be chosen so that they relate to each other. We force the residuals of the non- 
base systems to be parallel to the residual of the base system except for a component 
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Table 4.1 

Effect of projecting over different accuracies of eigenvectors 





250 mvp's for 1st 






385 mvp's 1st 






k 


eig. res. 


mvp's 


lin. eqs. res. 


eig. res. 


mvp's 


lin. eqs. res. 


10 


4.1C-2 


135 


3.3e-4 


4.4e-l 


135 


1.7e-6 


8 


3.3C-3 


165 


3.7e-5 


7.1e-6 


165 


1.3e-9 


6 


8.0O-5 


180 


2.8e-6 


3.3C-10 


180 


1.3e-9 


4 


l.Oe-6 


255 


5.4e-8 


2.4e-ll 


255 


5.5e-10 


2 


5.4e-9 


435 


1.5e-8 


9.9e-12 


435 


3.6e-10 



in the direction of Wfc+i, the last column of the Vk+i matrix from Equation ()2.8|) . 
We then continue solving but ignore this component. At the end, a correction can 
be done. However, for this correction, we need solution of shifted systems with one 
additional right-hand side, namely Vk+i- We begin discussion of this approach with 
the aspect of keeping residuals parallel except for the one component, then move on 
to the correction phase. 

Suppose we have shifted systems with parallel residuals, namely {A — ail){xi — 
io.i) — /3i7'o,i, with Pi = I. A projection over the columns of Vk can be implemented 
to give for the base system a new approximate solution xi and residual ri and for 
the other systems new approximate solutions Xi and residuals PiVi such that {A — 
ail){xi — ii) — ri, and 

(4.1) (A - (7il){xi - Xi) = ftri + 7iWfc+i, 

for i — 2, . . . , fc. First apply minres projection over the approximate eigenvectors 
spanned by the columns of Vk to the base system. The base system residual vector 
is then ri — ro,i — AVk+idi. We need di to be chosen so that = /S^ri + ^iVk+i-, for 
some scalar 7^. So we need 

ro,i - {A- aiI)Vkdi = /3i(ro,i - (A - aiI)Vkdi) +jiVk+i. 

After using that ro,i — Piro,i and using the key recurrence for the approximate eigen- 
vectors p.8|) . we have 

Vk+i{Hk - cril)di = PiVk+i{Hk - (yj)di - -fiVk+i- 

Next the Vk+i can be dropped, and we let t — (3i{Hk ~ ail)di: 

(4.2) {Hk-cril)di=t--fiek+i, 

where ek+i is the k + 1st coordinate vector of length fc -I- 1. We ignore the last row 
of (|4.2p and solve the first k equations for the unknown vector di of length fc. Then 
Equation (14. 2p is automatically true for some 7^. So we have (|4.ip . Note the /3i's 
do not change during this projection over approximate eigenvectors, unlike in the 
GMRES-Sh portion. 

Now we will look at the correction phase that is needed at the end of GMRES- 
Proj. Assume that we have already solved shifted systems with the extra right-hand 
side Vk+i (this solution will be discussed next) and have 



(4.3) 



{A - (Til)Si = Vk+l- 
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We assume that for a particular right hand side, the systems have been solved by 
GMRES-Proj to the point that the residual is only in the direction of Ufc+i and the 
system is recast as 

(4.4) {A~ <Tj){xi - Xi) = u = jiVk+i, 

for i = 2, . . . ,ns and for some scalar 7^. Here Xi is the approximate solution to Xi. 
We perform a Galerkin projection for system (|4.4p over the subspace spanned by the 
single vector Si from solution of (|4.3p : 

sJ{A - aiI)siS = jisjvk+i- 

Using Equation (|4.3p . this becomes 

sfvk+iS = ^isjvk+i- 

Then 5 — ^i. To determine 7^, we start with — ^iVk+i- Multiplying both sides by 
"^k+i ^^'^ using that Wfc+i is of unit length gives 



(4.5) 7^ 



The corrected solution of the system is Xi ^ Xi + 5si = + v'^^-^^riSi. 

We need to fill in the method for solving (|4.3p . the shifted systems with the extra 
right-hand side Vk+i- First, GMRES-Proj is applied. This includes projection over 
the approximate eigenvectors as described above alternating with cycles of GMRES. 
This continues until the residual is negligible except in the direction of Vk+i- So for 
the correction, we assume that 

(4.6) {A - aJ)S., = r 
is the current system, where 

r = Vk+i - {A - crj)si = 7iWfc+i, 
for some scalar 7^. Rearranging this residual equation gives 

(4.7) (.4 - crJ)S, = (1 - 7,)wfc+i . 

Applying Galerkin projection over the subspace spanned by the single vector Si to the 
system (|4.6p gives 

sf{A - aiI)siS = -iisjvk+i- 
With Equation (|4.7p . this becomes 

(1 - li)sjvk+i5 = %sjvk+i, 

and this simplifies to 

1 - 7« 

So the corrected solution is Si = Si + j^^Si = j^^Si. Finally, the 7^ can be determined 
to be 7i = v'^j^-^ri as it was for (|4.5p . 

We next list the algorithms for solution of the systems with second and subsequent 
right-hand sides and with the extra right-hand side. They are given in the order they 
were derived here, not in order of how they are actually used. 
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GMRES-Proj-Sh for the second and subsequent right-hand sides 

1. Consider the systems with the jth right-hand side (and with aU ns shifts). 

2. At the beginning of a cycle of GMRES-Proj-Sh, assume the current problem 
is (A — ail){xi — io,i) = Pifo^i, with f3i = 1, and where xa^i is the current 
approximate solution to the ith shifted system. 

3. Apply the Minres Projection for Vk to the base shift. This uses the Vk+i and 
Hk matrices developed while solving the first right-hand side with GMRES- 
DR-Sh. 

4. For shifted systems is = 2 . . . ns, solve {Hk — (Jil)di = f3i{Hk — (7i/)di, where 
Hk is the fc by A: portion of Hk- Set the new approximate solution as Xi = 
xo.i + Vkdi. 

5. Apply one cycle of GMRES(m)-Sh. 

6. Test the residual norms for convergence (can also test during the GMRES 
cycles). For the non-base systems, ignore the error term in the direction of 
Vk+i- Residual formula (|2.5p can be used. If not satisfied, go back to step 2. 
Otherwise conclude with step 7. 

7. Correction phase: Suppose the computed solution so far for the ith shifted 
system is Xi. Let the solution to the system with the extra right hand side 
Vk+i and shift ai be Si. The corrected solution is Xi = Xi + {v'^_^_^r)si and the 
corrected residual norm can be calculated. 

GMRES-Proj-Sh for the extra right-hand side Vk+i 
Same as for previous algorithm except for 

1. Consider the systems with right-hand side Vk+i (and with all ns shifts). 
7. Correction phase: Suppose the computed solution for the ith shifted system 
so far is Si. The corrected solution is = ( ) * Si, with 7^ = v^j^-^r. 

Example 5. We use the same test matrix. All right-hand sides are generated ran- 
domly. The systems with the first right-hand side are solved with GMRES-DR(25,10)- 
Sh as in Example 1. Then the extra right-hand Vk+i systems are solved (for all shifts) 
with GMRES(15)-Pr(10)-Sh. Finally, the second right-hand side systems are also 
solved with GMRES(15)-Pr(10)-Sh. All relative residual tolerances are rtol — l.e — 6. 
Figure 4.1 has residual curves for only two shifts, the base shifts of zero and a = —2. 
The solid line is the base shift. The dotted line shows the uncorrected residual norm 
for the second shift, while the dash-dot line has the second shift residuals if they 
are corrected (actually the correction needs to be done only once at the end). The 
uncorrected residual norm for the second right-hand side and second shift levels off 
at 4.e-3, but this is fixed by the correction phase. The convergence is faster than for 
GMRES-DR-Sh, because the eigenvectors are used from the beginning to speed up 
the convergence. Also the cost of GMRES(15)-Proj(10)-Sh is less than for GMRES- 
DR(25,10)-Sh, because it is fairly inexpensive to project over the approximate eigen- 
vectors compared to keeping the eigenvectors in the GMRES-DR subspace. Here the 
expense for the extra right-hand side is fairly significant, however it will not be if 
there are more right-hand sides. Figure 4.2 has the case of solving a total of 10 right- 
hand sides. Also, the extra right-hand side is solved only to relative residual tolerance 
of l.e-3. Now the expense for the extra right-hand side Vk+i is small compared to 
amount saved by speeding up solution of all the right-hand sides. 

Example 6. At the end of the previous example, the extra right-hand side is 
solved to low accuracy, but the correction for the subsequent right-hand sides is still 
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Fig. 4.1. Solution of first rhs, extra rhs and second rhs with two shifts. 




Fig. 4.2. Solution often right-hand sides with two shifts. 



successful. We now experiment with solving the extra right-hand side to different 
levels of accuracy. Table 4.2 shows the accuracy after correction for the a — —2 
system when the extra right-hand side system is solved to relative residual tolerances 
ranging from l.e-6 down to l.e-1 (the tolerance is checked for termination only at the 
end of GMRES cycles). The first and second right-hand side systems are solved to 
three different relative residual norm tolerances (l.e-6, l.e-8 and l.e-10) in the three 
rows of the table. The conclusion of this experiment is that the extra right-hand side 
systems do not need to be solved very accurately. For instance, with desired tolerance 
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Table 4.2 

Effect of solving the extra right-hand side system to different accuracies 



desired rtol 
of 1st and 2nd sys's 


accurracy of 2nd 
before correction 


l.e-6 


l.e-5 


l.e-4 


l.e-3 


l.e-2 


l.e-1 


l.c-6 


4.2C-3 


4.8e-6 


4.8C-6 


4.8e-6 


4.9e-6 


6.4e-6 


3.8e-4 


l.c-8 


3.6C-4 


2.4e-8 


2.4C-8 


2.4e-8 


2.5e-7 


9.4e-7 


9.9e-5 


l.c-10 


1.2e-3 


1.5e-10 


2.7e-10 


l.Oe-9 


9.7e-8 


2.7e-7 


3.1e-5 



l.e-8 for the first and second right-hand sides, the extra right-hand side systems need 
only to be solved to relative tolerance of l.e-3 to get accuracy of 2.5e-7 (and relative 
accuracy of 7.9e-9) for the second right-hand side system. 

The next example is probably the key example in the paper. It shows the value 
of deflating eigenvalues for an important application. QCD problems often have the 
need of solution of multiple right-hand sides and multiple shifts for each right-hand 
side. They also have complex spectra with small eigenvalues for the problems of most 
interest. 

Example 7. We look at a Wilson-Dirac matrix from lattice QCD [29j . The matrix 
is complex and the dimension is 393,216 by 393,216. The value of k is 0.158 for 
the base shift. This is approximately ^critical- The eigenvalues are in the right-half 
of the complex plane, but partially surround the origin [29]. The right-hand sides 
are unit vectors associated with particular space-time, Dirac and color coordinates. 
Often there are a dozen or more right-hand sides associated with each matrix and 
perhaps seven shifts for each right-hand side. We will just show solution of the second 
right-hand side for three shifts, cr = 0, —0.3, —0.5. The first right-hand side is solved 
with GMRES-DR(50,30) to residual tolerance of l.e-8 and the extra right-hand side 
to l.e-7. Then for the second right-hand side, GMRES-Proj uses 30 approximate 
eigenvectors for the projection in between cycles of GMRES(20). See Figure 4.3 for 
the results. GMRES(20)-Proj(30)-Sh converges in about one-tenth of the iterations 
needed for GMRES(20). To reach residual norm of less than 10^^ for the toughest 
system with shift of zero takes 2680 matrix- vector products for GMRES(20)-Sh and 
280 for GMRES(20)-Proj(30)-Sh. 

5. Related Right-hand Sides. We need to take advantage of any relationship 
between the right-hand sides. This section shows that this can easily be done even 
for the case of multiple shifts. 

Assume we have solutions of all shifted systems for right-hand sides 1 through j. 
So we have 

[A - <7j)xf^' = b'''^\ 
for irhs = 1, . . . , j — 1. We put these equations together to form 
(5.1) iA~aJ)X,^B, 

where B is the n by j — 1 matrix with columns through b-'~^, and Xi has columns 
x] through xi~^. We assume there is no initial guess. Applying Minres projection 
over the subspace spanned by the columns of Xi to the system with the right-hand 
side j and shift a.i gives 



X^iA - adfiA - aJ)X,d - (A - ajfV. 
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Matrix-Vector Products 
Fig. 4.3. Solution of ten related right-hand sides. 

With (|5.ip . this becomes 

(5.2) B'^Bd = B'^V. 

Note that the solution of Equation ()5.2|) is independent of the shift. This makes the 
residual vectors all the same: 

n = V - Ed. 

The approximate solutions are 

il = X,d. 

So this approach projects over all of the previous solutions for each shifted system 
and provides the needed parallel residuals. 

Example 8. We repeat the test in Example 5 with 10 right-hand sides, except 
this time they are related to each other. We define the second and subsequent right- 
hand sides as V = + 10"'' * u\ where is a random vector (both and 
have elements distributed normally with mean and variance 1). Before solving the 
second and subsequent right-hand sides, we project over the previous solutions as just 
described. The results are in Figure 5.1. Using the close relationship between the 
right-hand sides allows the number of matrix-vector products to be cut in half for 
each of the subsequent right-hand sides. 

6. Conclusion. Deflating eigenvalues can significantly improve restarted GM- 
RES for matrices with small eigenvalues. This work focuses on deflated GMRES for 
the case of multiple shifts of the matrix. When using Krylov methods to solve systems 
of equations with multiple shifts, the goal is to solve all systems with about the same 
expense as one system. Past work has developed such methods for non-restarted and 
even restarted Krylov methods. Here this is extended for deflated, restarted GMRES 
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Fig. 5.1. Solution of first rhs, extra rhs and second rhs with two shifts. 



methods, both in the case of a single right-hand side and for multiple right-hand sides. 
For multiple right-hand sides, there is added expense for solving an extra right-hand 
side, but this allows all shifted systems to be solved simultaneously for as many right- 
hand sides as are desired. Also, it is possible to efficiently take advantage of closely 
related right-hand sides, even in this case of multiple shifts. These approaches can be 
very beneficial for an important application in lattice QCD physics. 

Future work could examine other QCD problems including the overlap fermion 
problem, which involves solving an inner-outer loop. There may be potential for 
deffation in both of the loops. Deflation of non-restarted methods such as BiCGStab 
should also be investigated. Another interesting topic would be solution of systems 
with changing shifts, such as may occur in model order reduction [20j . 
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